Smart Paper Notes

Roadmap on Signal Processing for Next Generation Measurement Systems

D. K. Iakovidis , M. Ooi , Y. C. Kuang , S. Damidenko , A. Shestakov , V. Sinistin , M. Henry , A. Sciacchitano , A. Discetti , S. Donati

Category: Artificial Intelligence | Computer Vision

2021-11-03

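Signal processing is a fundamental component of almost any sensor-enabled system, with a wide range of applications across scientific disciplines. Time series data, images, and video sequences are representative forms of signals that can be enhanced and analysed for information extraction and quantification. Recent advances in artificial intelligence and machine learning are shifting research attention towards intelligent, data-driven signal processing. This roadmap presents a critical overview of state-of-the-art methods and applications, aiming to highlight future challenges and research opportunities for next-generation measurement systems. It covers a broad spectrum of topics, from basic to industrial research, organized in concise thematic sections that reflect the trends and impacts of current and future developments in each research field. It also offers guidance to researchers and funding agencies in identifying new prospects.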


Deep Learning for Space Weather Prediction: Bridging the Gap between Heliophysics Data and Theory

John C. Dorelli , Chris Bard , Thomas Y. Chen , Daniel Da Silva , Luiz Fernando Guides dos Santos , Jack Ireland , Michael Kirk , Ryan McGranaghan , Ayris Narock , Teresa Nieves-Chinchilla

Category: Machine Learning

2022-12-27

Traditionally, data analysis and theory have been viewed as separate disciplines, each feeding into fundamentally different types of models. Modern deep learning technology is beginning to unify these two disciplines and will produce a new class of predictively powerful space weather models that combine the physical insights gained from data and theory. We call on NASA to invest in the research and infrastructure necessary for the heliophysics community to take advantage of these advances.


Artificial Intelligence to Enhance Mission Science Output for In-situ Observations: Dealing with the Sparse Data Challenge

M. I. Sitnov , G. K. Stephens , V. G. Merkin , C. -P. Wang , D. Turner , K. Genestreti , M. Argall , T. Y. Chen , A. Y. Ukhorskiy , S. Wing

Category: Machine Learning

2022-12-26

In the Earth's magnetosphere, there are fewer than a dozen dedicated probes beyond low-Earth orbit making in-situ observations at any given time. As a result, we poorly understand its global structure and evolution, as well as the mechanisms of its main activity processes, magnetic storms and substorms. New Artificial Intelligence (AI) methods, including machine learning, data mining, and data assimilation, as well as new AI-enabled missions, will need to be developed to meet this Sparse Data challenge.


Source-Free Domain Adaptation for Question Answering with Masked Self-training

M. Yin , B. Wang , Y. Dong , C. Ling

Category: Natural Language Processing

2022-12-19

Most previous unsupervised domain adaptation (UDA) methods for question answering (QA) require access to source domain data while fine-tuning the model for the target domain. Source domain data may, however, contain sensitive information and may be restricted. In this study, we investigate a more challenging setting, source-free UDA, in which we have only the pretrained source model and target domain data, without access to source domain data. We propose a novel self-training approach to QA models that integrates a unique mask module for domain adaptation. The mask is auto-adjusted to extract key domain knowledge while the model is trained on the source domain. To maintain previously learned domain knowledge, certain mask weights are frozen during adaptation, while other weights are adjusted to mitigate domain shifts with pseudo-labeled samples generated in the target domain. Our empirical results on four benchmark datasets suggest that our approach significantly enhances the performance of pretrained QA models on the target domain, and even outperforms models that have access to the source data during adaptation.


Characterizing instance hardness in classification and regression problems

Gustavo P. Torquette , Victor S. Nunes , Pedro Y. A. Paiva , Lourenço B. C. Neto , Ana C. Lorena

Category: Machine Learning

2022-12-04

Recent work in the Machine Learning (ML) literature has demonstrated the usefulness of assessing which observations have labels that are hardest to predict accurately. By identifying such instances, one may inspect whether they have any quality issues that should be addressed. Learning strategies based on the difficulty level of the observations can also be devised. This paper presents a set of meta-features, also known as instance hardness measures, that aim to characterize which instances of a dataset are hardest to predict accurately and why. Both classification and regression problems are considered. Synthetic datasets with different levels of complexity are built and analyzed. A Python package containing all implementations is also provided.
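
To make the idea of an instance hardness measure concrete, here is a minimal sketch of k-Disagreeing Neighbors (kDN), a hardness measure known from the wider literature. Whether the paper's package includes it, and under what name, is an assumption; the function name, the toy data, and the choice of Euclidean distance are illustrative only.

```python
import math

def k_disagreeing_neighbors(X, y, k=3):
    """For each instance, return the fraction of its k nearest
    neighbors (Euclidean distance) that carry a different label."""
    scores = []
    for i, (xi, yi) in enumerate(zip(X, y)):
        # Rank every other instance by distance to instance i.
        ranked = sorted(
            (math.dist(xi, xj), j) for j, xj in enumerate(X) if j != i
        )
        neighbors = [j for _, j in ranked[:k]]
        disagree = sum(1 for j in neighbors if y[j] != yi)
        scores.append(disagree / len(neighbors))
    return scores

# Toy data (hypothetical): two tight clusters plus one instance whose
# label disagrees with the cluster it sits in.
X = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
     (1.0, 1.0), (1.1, 1.0), (1.0, 1.1),
     (0.05, 0.05)]
y = [0, 0, 0, 1, 1, 1, 1]
hardness = k_disagreeing_neighbors(X, y, k=3)
```

The last instance sits among class-0 points but is labeled 1, so all of its neighbors disagree and it receives the highest score. Instances like this are the ones worth inspecting for label noise.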


A Machine Learning Approach to Solving Large Bilevel and Stochastic Programs: Application to Cycling Network Design

Timothy C. Y. Chan , Bo Lin , Shoshanna Saxe

Category: Machine Learning

2022-09-20

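We present a novel machine learning-based approach to solving bilevel programs that involve a large number of independent followers, which include two-stage stochastic programming as a special case. We propose an optimization model that explicitly considers a sampled subset of followers and uses a machine learning model to estimate the objective values of the unsampled followers. Unlike existing approaches, we embed the training of the machine learning model into the optimization problem, which allows us to employ general follower features that cannot be represented using leader decisions. We prove bounds on the optimality gap of the generated leader decision, as measured by the original objective function that considers the full follower set. We then develop follower sampling algorithms to tighten the bounds, and a representation learning approach to learn follower features that can be used as inputs to the embedded machine learning model. Using synthetic instances of a cycling network design problem, we compare the computational performance of our approach with baseline methods. Our approach provides more accurate predictions of follower objective values and, more importantly, generates leader decisions of higher quality. Finally, we perform a real-world case study on cycling infrastructure planning, where we apply our approach to solve a network design problem with over one million followers. Our approach performs favorably compared with current cycling network expansion practices.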


Self-Supervised Clustering on Image-Subtracted Data with Deep-Embedded Self-Organizing Map

Y. -L. Mong , K. Ackley , T. L. Killestein , D. K. Galloway , M. Dyer , R. Cutter , M. J. I. Brown , J. Lyman , K. Ulaczyk , D. Steeghs

Category: Computer Vision

2022-09-14

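Developing an effective automatic classifier to separate genuine sources from artifacts is essential for transient follow-up in wide-field optical surveys. Identifying transient detections among the subtraction artifacts left by the image differencing process is a key step in such classifiers, known as the real-bogus classification problem. We apply a self-supervised machine learning model, the deep-embedded self-organizing map (DESOM), to this "real-bogus" classification problem. DESOM combines an autoencoder and a self-organizing map to perform clustering, distinguishing real from bogus detections based on their dimensionality-reduced representations. We use 32x32 normalized detection thumbnails as the input to DESOM. We demonstrate different model training approaches and find that our best DESOM classifier shows a missed detection rate of 6.6% with a false positive rate of 1.5%. DESOM offers a more nuanced way to fine-tune the decision boundary that identifies likely real detections when used in combination with other types of classifiers, for example those built on neural networks or decision trees. We also discuss other potential uses of DESOM and its limitations.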
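
As a rough illustration of the self-organizing map half of DESOM, the sketch below trains a tiny one-dimensional SOM on two-dimensional points. It is not the authors' implementation: the autoencoder is omitted, and the function names, hyperparameters, toy data, and the choice to seed the units from data points are all assumptions made for the example.

```python
import math

def best_matching_unit(weights, x):
    """Index of the map unit whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda u: math.dist(weights[u], x))

def train_som(data, init_weights, epochs=50, lr=0.5, sigma=0.5):
    """Train a 1-D SOM: units are arranged on a line and indexed 0..n-1."""
    weights = [list(w) for w in init_weights]
    for epoch in range(epochs):
        # Linearly decay the learning rate and the neighborhood width.
        frac = 1.0 - epoch / epochs
        cur_lr = lr * frac
        cur_sigma = max(sigma * frac, 1e-3)
        for x in data:
            bmu = best_matching_unit(weights, x)
            for u, w in enumerate(weights):
                # Gaussian neighborhood: units near the BMU move more.
                h = math.exp(-((u - bmu) ** 2) / (2 * cur_sigma ** 2))
                for d in range(len(w)):
                    w[d] += cur_lr * h * (x[d] - w[d])
    return weights

# Toy "embeddings" (hypothetical): two well-separated groups.
data = [(0.0, 0.1), (0.1, 0.0), (0.05, 0.05),
        (0.9, 1.0), (1.0, 0.9), (0.95, 0.95)]
som = train_som(data, init_weights=[data[0], data[-1]])
cluster_a = best_matching_unit(som, (0.0, 0.0))
cluster_b = best_matching_unit(som, (1.0, 1.0))
```

After training, points from the two groups map to different units. In a real-bogus setting, each unit would then be inspected or labeled as predominantly real or bogus.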


Data efficient reinforcement learning and adaptive optimal perimeter control of network traffic dynamics

C. Chen , Y. P. Huang , W. H. K. Lam , T. L. Pan , S. C. Hsu , A. Sumalee , R. X. Zhong

Category: Machine Learning

2022-09-13

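Existing data-driven and feedback traffic control strategies do not consider the heterogeneity of real-time data measurements. In addition, traditional reinforcement learning (RL) methods for traffic control usually converge slowly because they lack data efficiency. Moreover, conventional optimal perimeter control schemes require exact knowledge of the system dynamics and are therefore fragile to endogenous uncertainties. To address these challenges, this work proposes an integral reinforcement learning (IRL) based approach to learning the macroscopic traffic dynamics for adaptive optimal perimeter control. This work makes the following primary contributions to the transportation literature: (a) A continuous-time control is developed with discrete gain updates to adapt to discrete-time sensor data. (b) To reduce the sampling complexity and use the available data more efficiently, the experience replay (ER) technique is introduced into the IRL algorithm. (c) The proposed method relaxes the requirement for model calibration in a "model-free" manner, which provides robustness against modeling uncertainty and enhances real-time performance via a data-driven RL algorithm. (d) The convergence of the IRL-based algorithms and the stability of the controlled traffic dynamics are proven via Lyapunov theory. The optimal control law is parameterized and then approximated by neural networks (NN), which moderates the computational complexity. Both state and input constraints are considered, while no model linearization is required. Numerical examples and simulation experiments are presented to verify the effectiveness and efficiency of the proposed method.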
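
The experience replay idea in contribution (b) can be sketched generically as a fixed-size buffer of past transitions that is re-sampled during learning. This is a plain replay buffer, not the paper's IRL-specific formulation; the class name, the tuple layout `(state, control, cost, next_state)`, and the toy values are assumptions for illustration.

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-capacity store of past transitions, sampled uniformly."""

    def __init__(self, capacity, seed=0):
        # A deque with maxlen silently drops the oldest entry when full.
        self._buffer = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def add(self, state, control, cost, next_state):
        self._buffer.append((state, control, cost, next_state))

    def sample(self, batch_size):
        # Sample without replacement, capped at the current buffer size.
        size = min(batch_size, len(self._buffer))
        return self._rng.sample(list(self._buffer), size)

    def __len__(self):
        return len(self._buffer)

# Toy usage (hypothetical scalar states and controls).
buffer = ReplayBuffer(capacity=3)
for step in range(5):
    buffer.add(state=step, control=0.1 * step, cost=1.0, next_state=step + 1)
batch = buffer.sample(2)
stored_states = sorted(t[0] for t in buffer.sample(3))
```

Reusing stored measurements in this way lets each sensor reading contribute to several updates, which is the data-efficiency argument the abstract makes.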


Model interpretation using improved local regression with variable importance

Gilson Y. Shimizu , Rafael Izbicki , Andre C. P. L. F. de Carvalho

Category: Machine Learning (Statistics) | Machine Learning

2022-09-12

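A fundamental question about the use of ML models concerns the explanation of their predictions to increase transparency in decision-making. Although several interpretability methods have emerged, some gaps regarding the reliability of their explanations have been identified. For instance, most methods are unstable (meaning that they give very different explanations with small changes in the data) and do not cope well with irrelevant features (that is, features not related to the label). This article introduces two new interpretability methods, VarImp and SupClus, that overcome these issues by using local regression fits with a weighted distance that takes variable importance into account. VarImp generates explanations for each instance and can be applied to datasets with more complex relationships, whereas SupClus interprets clusters of instances with similar explanations and can be applied to simpler datasets where clusters can be found. We compare our methods with state-of-the-art approaches and show that they yield better explanations according to several metrics, particularly in high-dimensional problems with irrelevant features and when the relationship between features and target is non-linear.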
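
The role of an importance-weighted distance can be illustrated with a small sketch. For brevity it uses a kernel-weighted local average (a local-constant fit) rather than the local regression fits the abstract describes, and the importance values are supplied by hand rather than estimated. The function names, bandwidth, and toy data are assumptions, not VarImp or SupClus themselves.

```python
import math

def weighted_distance(a, b, importance):
    """Euclidean distance with each squared difference scaled by importance."""
    return math.sqrt(
        sum(w * (ai - bi) ** 2 for ai, bi, w in zip(a, b, importance))
    )

def local_estimate(X, y, query, importance, bandwidth=0.3):
    """Gaussian-kernel weighted average of y around the query point."""
    weights = [
        math.exp(-weighted_distance(x, query, importance) ** 2
                 / (2 * bandwidth ** 2))
        for x in X
    ]
    return sum(w * yi for w, yi in zip(weights, y)) / sum(weights)

# Toy data (hypothetical): y depends only on feature 0; feature 1 is noise.
X = [(0.0, 5.0), (0.25, -3.0), (0.5, 8.0), (0.75, -7.0), (1.0, 2.0)]
y = [0.0, 0.25, 0.5, 0.75, 1.0]
importance = (1.0, 0.0)  # hand-set: feature 1 is treated as irrelevant

pred_a = local_estimate(X, y, (0.5, 0.0), importance)
pred_b = local_estimate(X, y, (0.5, 100.0), importance)
```

Because the irrelevant feature has zero weight in the distance, the two queries select the same neighborhood and receive the same estimate, however far apart they are along that feature.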


Graph Neural Networks for Low-Energy Event Classification & Reconstruction in IceCube

R. Abbasi , M. Ackermann , J. Adams , N. Aggarwal , J. A. Aguilar , M. Ahlers , M. Ahrens , J. M. Alameddine , A. A. Alves Jr. , N. M. Amin

Category: Machine Learning

2022-09-07

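IceCube, a cubic-kilometer array of optical sensors built to detect atmospheric and astrophysical neutrinos between 1 GeV and 1 PeV, is deployed 1.45 km to 2.45 km below the surface of the ice sheet at the South Pole. The classification and reconstruction of events from the in-ice detectors play a central role in the analysis of IceCube data. Reconstructing and classifying events is a challenge because of the detector geometry, the inhomogeneous scattering and absorption of light in the ice and, below 100 GeV, the relatively low number of signal photons produced per event. To address this challenge, IceCube events can be represented as point cloud graphs, with a Graph Neural Network (GNN) used as the classification and reconstruction method. The GNN is capable of distinguishing neutrino events from cosmic-ray backgrounds, classifying different neutrino event types, and reconstructing the deposited energy, direction, and interaction vertex. Based on simulation, we provide a comparison in the 1-100 GeV energy range with the current state-of-the-art maximum likelihood techniques used in IceCube analyses, including the effects of known systematic uncertainties. For neutrino event classification, the GNN increases the signal efficiency by 18% at a fixed false positive rate (FPR) compared with current IceCube methods. Alternatively, the GNN reduces the FPR by more than a factor of 8 (to below half a percent) at a fixed signal efficiency. For the reconstruction of energy, direction, and interaction vertex, the resolution improves by an average of 13%-20% compared with current maximum likelihood techniques. When running on a GPU, the GNN can process IceCube events at a rate nearly double the median IceCube trigger rate of 2.7 kHz, which opens the possibility of using low-energy neutrinos in online searches for transient events.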
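
One common way to turn a point cloud into a graph is to connect each point to its k nearest neighbors. The sketch below shows that construction on toy sensor positions. The abstract does not say how the IceCube graphs are built, so the k-nearest-neighbor rule, the function name, and the coordinates are assumptions for illustration.

```python
import math

def knn_edges(positions, k=2):
    """Return directed edges (i, j) from each point i to its k nearest
    neighbors j by Euclidean distance. Ties go to the lower index."""
    edges = []
    for i, p in enumerate(positions):
        ranked = sorted(
            (math.dist(p, q), j) for j, q in enumerate(positions) if j != i
        )
        edges.extend((i, j) for _, j in ranked[:k])
    return edges

# Toy sensor hits (hypothetical x, y, z positions in arbitrary units).
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0),
             (2.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
edges = knn_edges(positions, k=1)
```

In a full pipeline each node would also carry features such as hit time and charge, and the edge list would be passed to a GNN library. Those steps are outside this sketch.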
